feat(vllm-tensorizer): Bump vLLM to v0.20.2 on CUDA 13.2 / Ubuntu 24.04#160

Merged
JustinPerlman merged 13 commits into main from jperlman/vllm0.20.2
May 15, 2026

Conversation

@JustinPerlman
Contributor

@JustinPerlman JustinPerlman commented May 12, 2026

Summary

  • Bump vLLM to v0.20.2
  • Add a build matrix producing two variants, both on Ubuntu 24.04 / torch 2.11.0:
    • v0.20.2-cuda13.2.1-ubuntu24.04
    • v0.20.2-cuda12.9.1-ubuntu24.04
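
As a rough illustration of how the two variant names above decompose, the following sketch assembles each tag from a vLLM version, a CUDA version, and an Ubuntu version. The variable names and the `image_tag` helper are hypothetical and are not taken from this PR's build configuration; only the resulting tag strings come from the list above.

```shell
#!/bin/sh
# Sketch only: compose the variant tags from their parts.
set -eu

# Hypothetical variable names; the values mirror the variants listed above.
VLLM_VERSION="v0.20.2"
UBUNTU_VERSION="24.04"

# Hypothetical helper: takes a CUDA version and prints the variant tag.
image_tag() {
    printf '%s-cuda%s-ubuntu%s\n' "$VLLM_VERSION" "$1" "$UBUNTU_VERSION"
}

# One tag per CUDA version in the matrix.
for cuda in 13.2.1 12.9.1; do
    image_tag "$cuda"
done
```

Deriving the tag from its inputs keeps the CUDA version as the only value that differs between matrix entries.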

Ubuntu 24.04 compatibility fixes

  • Remove python3-pip from apt in builder-base and add rm -f /usr/lib/python3.*/EXTERNALLY-MANAGED before pip bootstrap — on Ubuntu 24.04, apt-installed pip has no RECORD file and blocks pip self-upgrade
  • Purge python3-jwt in the final base stage before pip installs — same root cause: Debian-managed PyJWT has no RECORD file and blocks vLLM's dependency resolution
  • Fix cuda-python version spec from ~=${CUDA_VERSION} to ~=${CUDA_VERSION%.*} — patch-level CUDA versions (e.g. 13.2.1) don't match available cuda-python releases; strip to major.minor
  • Install wheel package in lmcache-builder and restore it to builder-base pip install

Relevant information: vllm-project/vllm@6c964bd

@JustinPerlman JustinPerlman self-assigned this May 12, 2026
@JustinPerlman JustinPerlman requested a review from a team as a code owner May 12, 2026 19:30
@github-actions

@JustinPerlman Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/25751418629
Image: ghcr.io/coreweave/ml-containers/vllm-tensorizer:jperlman-vllm0.20.2-cc65ad3-v0.20.2

@JustinPerlman JustinPerlman requested review from abatilo and ritazh May 12, 2026 19:47
Contributor

@abatilo abatilo left a comment

This seems reasonable to me but I'd feel better if @Eta0 could take a peek

@JustinPerlman
Contributor Author

JustinPerlman commented May 12, 2026

This seems reasonable to me but I'd feel better if @Eta0 could take a peek

Fair enough lol

@JustinPerlman JustinPerlman requested a review from Eta0 May 12, 2026 19:51
@alexeldeib
Contributor

Pure 13.2, no matrix with 12.9? 🫣 I would really like having both options… if it's a giant pain on the vLLM side it's fine, but then I think you need to validate that this actually works on b40/rtxp6000 with the latest supported/installed drivers cw ships

@alexeldeib
Contributor

I am still not aware of a cuda + driver combo that has decent support and works as expected, but haven’t followed too closely lately

@github-actions

@JustinPerlman Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/25919982852
Image: ghcr.io/coreweave/ml-containers/vllm-tensorizer:jperlman-vllm0.20.2-202ef09-v0.20.2-cuda13.2.1-ubuntu24.04

@github-actions

@JustinPerlman Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/25919982852
Image: ghcr.io/coreweave/ml-containers/vllm-tensorizer:jperlman-vllm0.20.2-202ef09-v0.20.2-cuda12.9.1-ubuntu24.04

Contributor

@alexeldeib alexeldeib left a comment

seems sane

@JustinPerlman JustinPerlman merged commit 287015d into main May 15, 2026
9 checks passed
@JustinPerlman JustinPerlman deleted the jperlman/vllm0.20.2 branch May 15, 2026 15:53
Collaborator

I'd personally suggest not repeating yourself as much in this config file and constructing more parts of it dynamically, like the tag suffix, but that's not a hard requirement.

Comment on lines +27 to +28
rm -f /usr/lib/python3.*/EXTERNALLY-MANAGED && \
python3 -m pip install -U --no-cache-dir pip packaging 'setuptools>=77.0.3,<81.0.0' wheel setuptools_scm regex build
Collaborator

rm -f /usr/lib/python3.*/EXTERNALLY-MANAGED and the pip/setuptools installation and upgrade are already handled by the torch image, so you don't need to repeat those bits here.

apt-get install -y --no-install-recommends curl libsodium23 libnuma-dev && \
apt-get purge -y python3-jwt && \
apt-get clean && \
rm -f /usr/lib/python3.*/EXTERNALLY-MANAGED
Collaborator

@Eta0 Eta0 May 15, 2026

Same comment as before: this rm is already handled by the base image.

RUN apt-get -qq update && apt-get install -y --no-install-recommends curl libsodium23 libnuma-dev && apt-get clean
RUN apt-get -qq update && \
apt-get install -y --no-install-recommends curl libsodium23 libnuma-dev && \
apt-get purge -y python3-jwt && \
Collaborator

@Eta0 Eta0 May 15, 2026

What's that apt-get purge -y python3-jwt for? 👀
